\documentclass[prodmode,acmjetc]{acmsmall}

%------------------------------------------------------------------------------

\usepackage{cite}
\usepackage{amsmath}

% \newtheorem{definition}{Definition}
% \usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{url}
\usepackage{balance}
% \usepackage{float}
% \usepackage{threeparttable}
% \usepackage{multirow}
% \usepackage{epstopdf}
% \usepackage{mathptmx}
% \usepackage[scaled=.90]{helvet}
% \usepackage{courier}
\usepackage{listings}
\lstset{
   language=C,
   basicstyle=\small,
   keywordstyle=\bfseries,
   identifierstyle=\ttfamily,
   stringstyle=\ttfamily,
   numbers=left,
   numberstyle=\tiny,
   stepnumber=1,
   numbersep=-5pt,
   showstringspaces=false
%   frame=single %trbl%
}

% \usepackage[normalem]{ulem}



%------------------------------------------------------------------------------

\newcommand{\etal}{\emph{et al.}}
\newcommand{\eg}{\emph{e.g.}}
\newcommand{\ie}{\emph{i.e.}}
\newcommand{\etc}{\emph{etc.}}
\newcommand{\cf}{\emph{cf.}}

%------------------------------------------------------------------------------

\begin{document}

\markboth{A. Mineo et al.}{A Runtime Tunable Transmitting Power Technique for Improving Energy Efficiency in mm-Wave WiNoC Architectures}

%------------------------------------------------------------------------------

\title{A Runtime Tunable Transmitting Power Technique for Improving
  Energy Efficiency in mm-Wave WiNoC}
% {Transmitter Power Aware in mm-Wave WiNoC} 

\author{ANDREA MINEO
\affil{University of Catania}
MAURIZIO PALESI
\affil{Kore University}
GIUSEPPE ASCIA
\affil{University of Catania}
VINCENZO CATANIA
\affil{University of Catania}}


%% \author{Andrea~Mineo~\IEEEmembership{Student Member,~IEEE,} Maurizio~Palesi,~\IEEEmembership{Member,~IEEE,}
%%   Giuseppe~Ascia, and~Vincenzo~Catania\thanks{A.~Mineo, G.~Ascia and
%%     V.~Catania are with the Dipartimento di Ingegneria Elettrica,
%%     Elettronica e Informatica, University of Catania, Catania, Italy
%%     (email: \{gascia,vcatania\}@dieei.unict.it). M.~Palesi is with
%%     Kore University, Enna (email: maurizio.palesi@unikore.it).}}

%% \maketitle

%---------------------------------------------------------------------

\begin{abstract}
In the last few years, many commercial multiprocessor Systems-on-Chip
(MPSoCs) that use a Network-on-Chip (NoC) as interconnection fabric
have been released by leading chip vendors such as TILERA and Intel. In
modern CMOS technologies, the integration density continues to
increase while the limitations of the wiring interconnect become a
bottleneck, especially in multi-hop intra-chip communications. Emerging
architectures, such as Wireless NoC (WiNoC), are candidate
solutions to the communication latency issues that
characterise such many-core architectures. In a WiNoC, metallic wires
are replaced with long-range radio interconnections. Unfortunately,
the energy consumed by the RF transceiver (\ie, the main building
block in a WiNoC), and in particular by its transmitter, accounts for
a significant fraction of the overall communication energy.  To
reduce this contribution, this paper proposes a runtime tunable
transmitting power technique for improving the energy efficiency of
the transceiver in WiNoC architectures. The basic idea is to tune the
transmitting power based on the physical location of the recipient of
the current communication. The integration of the proposed technique
into two main WiNoC families, namely, mesh topology-based and
mm-wave small-world (mSWNoC), resulted in an average energy reduction
of 50\% and 20\%, respectively. The application of the proposed
technique does not affect the performance metrics of the WiNoC in
which it is used. In addition, the area overhead of its
implementation is negligible compared to the area of the RF
transceiver.
\end{abstract}

%------------------------------------------------------------------------------

\category{C.2.2}{Computer-Communication Networks}{Network Protocols}

\terms{Design, Algorithms, Performance}

\keywords{Wireless sensor networks, media access control,
multi-channel, radio interference, time synchronization}

%------------------------------------------------------------------------------

\begin{bottomstuff}
Authors' addresses: A. Mineo, G. Ascia, V. Catania, Dipartimento di
Ingegneria Elettrica, Elettronica e Informatica, University of
Catania, Italy; M. Palesi, Facolt{\`a} di Ingegneria e Architettura,
University of Enna, KORE, Italy.
\end{bottomstuff}

\maketitle

%------------------------------------------------------------------------------

\section{Introduction}
Multiprocessor Systems-on-Chip (MPSoCs) that use a Network-on-Chip
(NoC) as interconnection fabric are now a commercial reality. For
instance, TILERA has released a 72-core processor~\cite{tilera_72},
while Intel has leveraged its research results with Teraflop and
SCC~\cite{vangal_jssc08,intel_scc}, releasing a series of coprocessors
named Xeon Phi~\cite{intel_xeonphi} with 61 cores, which use a
ring-based NoC as on-chip communication backbone. Adapteva has brought
to market a multicore parallel computing fabric that consists of a
2D array of compute nodes connected by a low-latency mesh
NoC~\cite{adapteva64}.

As the number of cores integrated into the same chip increases, the
role played by the on-chip communication system becomes more and more
important. The cost (\ie, silicon area), the performance (\eg,
communication delay, throughput, \etc), and the energy consumption of
the NoC are common design optimization metrics. For instance, with
regard to communication performance, the communication latency
increases with the network size because of the multi-hop nature of
NoC-based systems. Another issue regards
the global metal wires used in traditional CMOS technology.  In
particular, while the integration density and the device speed
increase, the electrical interconnections become the
bottleneck in terms of delay and power
consumption~\cite{ho_poieee01}. To address this problem,
three-dimensional integration, nanophotonic communication, and
RF/wireless interconnects are emerging as technological alternatives
to the metal/dielectric system. In particular, RF/wireless
interconnects can be divided into two main families, called
RF-I~\cite{chang_hpca08} and Wireless NoC (WiNoC)~\cite{zhao_tc08}.

The first one is based on the propagation, at the (effective) speed of
light, of an electromagnetic wave through a waveguide formed by
two close conductors using standard CMOS technology.  The waveguide
acts as a highway for the travelling information.  Although the RF-I
solution has demonstrated its effectiveness in terms of latency and
low power dissipation~\cite{chang_micro08}, its performance does not
scale as the number of communicating cores increases.

Scalability issues are solved by WiNoC
architectures. WiNoCs~\cite{zhao_tc08} use a wireless backbone on top of the
traditional wire-based NoC~\cite{deb_jetcas12}. A WiNoC introduces new
hardware structures, such as antennas and transceivers, that represent
an overhead in terms of area and power. The use of concentrated
architectures and hierarchical topologies~\cite{ditommaso_hoti11} is a
viable solution to the antenna area overhead issue. With
regard to the power issue, the major contribution is due to the radio
transmitter front-end connected to the antenna. For instance,
in~\cite{yu_mwscas11} the transmitter is responsible for about 65\% of
the overall transceiver power consumption, while in~\cite{daly_jssc07}
this contribution is more than 74\%. Previous work in the context of
WiNoCs assumes transmitters in which the transmitting power is
constant (regardless of the distance of the destination node) and
sized to guarantee a given reliability level (in terms of bit error
rate, BER) in the worst case.

In this paper we propose a novel mechanism for improving the energy
efficiency of the transmitters in WiNoC architectures. The basic idea
is to allow the transmitter to set its transmitting power at runtime
based on the reliability requirements and the destination node of the
current communication. We provide a systematic approach that, under a
reliability constraint (given in terms of maximum BER) and for each
antenna, determines the optimal transmitting power for each
destination node. The optimal transmitting power is computed offline
by using an accurate 3D field solver for a limited number of
measurements. The obtained power figures are then used for configuring
the proposed variable gain amplifier controller, which is responsible
for driving the power amplifier connected to the transmitting antenna.
We found that integrating the proposed technique into two known mesh
topology-based WiNoC architectures, namely
iWise64~\cite{ditommaso_hoti11} and McWiNoC~\cite{zhao_nocs11}, and into
mSWNoC~\cite{deb_tc13} results in an energy reduction of 48\%, 50\%,
and 20\%, respectively.

%% Finally
%% the proposed architecture has also been proven for a specific
%% millimeter-wave small-world wireless NoC (mSWNoC) architecture
%% obtaining an average saving of 19\%.

%------------------------------------------------------------------------------

\section{Preliminaries and Related Work}
\label{sec:related}
\begin{table}
  \centering
  \tbl{ITRS projections for the transition frequency $f_t$ and maximum
  oscillating frequency $f_{max}$~\cite{itrs_rfams12}.\label{tab:itrs}}{%
  \begin{tabular}{lcccccc}
    \hline
    Year            & 2012  & 2013 & 2014 & 2015 & 2016 & 2017  \\
    \hline
    $f_t$ (GHz)     & 315   & 315  & 345  & 360  & 375  & 390   \\
    $f_{max}$ (GHz) & 420   & 455  & 490  & 525  & 560  & 595   \\
    \hline
  \end{tabular}}
\end{table}
The possibility of radio communications inside the chip is a novel
technique, initially conceived for distributing clock signals inside the
chip in order to reduce clock skew related problems~\cite{floyd_jssc02}. The
main obstacle until then was integrating an antenna
in a standard silicon substrate in a way compatible with CMOS technology.
This is linked to the capability of transistors to operate at high
frequencies. Tab.~\ref{tab:itrs} shows the trend for the transition
(cut-off) and maximum oscillating frequencies of MOS transistors as
foreseen by the International Technology Roadmap for Semiconductors
(ITRS)~\cite{itrs_rfams12}. The meaning of such a projection is that,
over time, active devices can operate at higher and higher
frequencies.  Since the dimension of an antenna has to be
comparable with the wavelength, the first consequence of a higher
operating frequency is that the dimension of the antenna decreases.

For instance, a dipole antenna (simply formed by two
conductors) operating at 60~GHz would have a length of $\mathrm{632
  \times 2 \ \mu m}$ when integrated in a silicon
substrate~\cite{gutierez_jsac09}, while at 5.8~GHz the
length increases to $\mathrm{6.5 \times 2 \ mm}$, which is
comparable with the entire die size. Furthermore, the scaling is not
limited to the antenna: it also affects the passive elements
inside the main building blocks of the RF front-end, which are
responsible for a relevant fraction of its silicon area.

\begin{figure}
  \centering
  \includegraphics[width=0.30\textwidth ,angle=270]{pictures/lna.eps}
  \caption{Common Source RF amplifier.}
  \label{fig:rf_amplifier}
\end{figure}
To have a quantitative idea of the effects of the scaling on some of
the passive elements which form the RF front-end, let us consider the
RF-amplifier shown in Fig.~\ref{fig:rf_amplifier}. One of the design
steps is the sizing of the $LC$ group so that it resonates at the
center of the operating band. To do this, the total admittance must be
set to zero at the target frequency (center of band), that is:
\begin{equation}
   Y_C + Y_L = j \omega_c C_p + \frac{1}{j\omega_c L}=j\bigg(\omega_c
C_p - \frac{1}{\omega_c L}\bigg) = 0, 
  \label{eq:ammettance}
\end{equation}
where $C_p$ is the parasitic capacitance, which depends on the active
devices and on other effects such as the contact capacitance and
the parasitic capacitance introduced by the inductor itself, and
$\omega_c$ is the angular operating frequency. Thus, solving
Eqn.~(\ref{eq:ammettance}) for $L$, we have:
\begin{equation}
  L = \frac{1}{\omega_c^2 C_p}.
  \label{eq:set_ammettance}
\end{equation}
From Eqn.~(\ref{eq:set_ammettance}) it can be observed that, as the
frequency increases, the required inductance decreases.
For instance, as reported in~\cite{chang_hpca08}, at 20~GHz the size
of the inductor is approximately $\mathrm{50\ \mu m \times 50\ \mu m}$,
while at 400~GHz it can be reduced to $\mathrm{12\ \mu m\times 12
  \ \mu m}$.
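
As a rough numerical illustration of Eqn.~(\ref{eq:set_ammettance}),
assume a parasitic capacitance of $50~\mathrm{fF}$. This value is
chosen only for the sake of the example and is not taken from a
specific design. Under this assumption, the inductance required to
resonate at 20~GHz and at 60~GHz would be approximately:
\begin{align*}
  L_{20} &= \frac{1}{(2\pi \cdot 20~\mathrm{GHz})^2 \cdot 50~\mathrm{fF}}
          \approx 1.27~\mathrm{nH}, \\
  L_{60} &= \frac{1}{(2\pi \cdot 60~\mathrm{GHz})^2 \cdot 50~\mathrm{fF}}
          \approx 0.14~\mathrm{nH},
\end{align*}
\ie, for a fixed capacitance, tripling the operating frequency reduces
the required inductance by a factor of nine, since the inductance
scales with the inverse square of the frequency.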

Based on the above considerations, several research groups have proven
the possibility of integrating every building block of the RF front
end (including the antenna) into the same
chip~\cite{floyd_jssc02,o_ted05,lin_jssc07}.

In the context of on-chip communication, the capability of integrating
an antenna with its transceiver into a silicon die~\cite{lin_jssc07}
has led several research groups to assess the advantages of having
long-range wireless links on top of traditional wire-based NoCs. An
exhaustive overview of the state of the art in WiNoC architectures
can be found in~\cite{deb_jetcas12}. There, the authors divide the
various WiNoC architectures into two main classes, namely, mesh-based
topology and small-world-based topology WiNoCs. Another classification
can be made on the basis of the portion of the electromagnetic spectrum
used for data transmission, such as UWB~\cite{zhao_tc08} (few GHz),
mm-wave~\cite{deb_asap10,ditommaso_hoti11,deb_isqed12,deb_glsvlsi12}
(tens of GHz), sub-THz~\cite{lee_mobicom_09} (hundreds of GHz), and
THz~\cite{ganguly_tc10} NoCs. While the first three use the
metallization present in standard CMOS technology as antenna, the
last one makes use of carbon nanotubes.

\begin{figure}
  \centering
  \includegraphics[width=0.4\textwidth]{pictures/zig_zag.eps}
  \caption{The zigzag antenna.}
  \label{fig:zigzag}
\end{figure}
In mm-wave WiNoCs, the zigzag antenna (Fig.~\ref{fig:zigzag}) is viewed as
the best candidate solution for an on-chip antenna~\cite{deb_jetcas12}.
A zigzag antenna for the mm-wave band can be designed and characterized
with well-established techniques and knowledge, such as the use of
field solvers. Furthermore, the use of regular topologies, like 2D
meshes, allows the exploitation of symmetries that simplify their
characterization. Several examples of mesh-based WiNoC architectures
can be found in the
literature~\cite{lee_mobicom_09,ditommaso_hoti11,zhao_nocs11,wang_pdp11}. In
this context, the most used modulation technique is Amplitude
Shift Keying with On-Off Keying (ASK-OOK)~\cite{deb_asap10,
  ditommaso_hoti11,deb_jetcas12}. Although, for a given bit error rate
(BER), the ASK-OOK modulation requires a higher transmitting power
than that required by other modulation techniques (\eg, Quadrature
Amplitude Modulation (QAM)~\cite{couch2007digital}) and has a poor
spectral efficiency, its hardware implementation is simple (low area
overhead as compared to QAM) and well suited to the on-chip
context. In this paper, we propose a technique and circuitry for
improving the power efficiency of an ASK-OOK transceiver by means of a
reliability-aware on-line transmitting power modulation.

%------------------------------------------------------------------------------

\section{Adaptive Transmitting Power Transceiver}
\label{sec:proposed}
This section presents the proposed adaptive transmitting power
transceiver, which adaptively determines the optimal transmitting
power, based on the packet destination address, under reliability
constraints expressed in terms of maximum allowed communication bit
error rate.

%------------------------------------------------------------------------------

\subsection{Variable Gain Amplifier Controller}
Traditional transceivers in WiNoC architectures use the same
transmitting power regardless of the distance (location) of the
destination node. In fact, the transmitting power is set for the worst
case under a reliability (\ie, maximum BER) constraint. We propose to
select at runtime the minimum transmitting power based on the physical
location of the destination node of the current communication. Of
course, the selected minimum transmitting power must be high enough to
meet the communication reliability constraints in terms of BER.

\begin{figure}
  \centering
  \includegraphics[width=0.50\textwidth]{pictures/blocks.eps}
  \caption{Scheme of the proposed adaptive transmitting power
    transceiver.}
  \label{fig:tx_scheme}
\end{figure}
The general scheme of the proposed adaptive transmitting power
transceiver is shown in Fig.~\ref{fig:tx_scheme}. As compared to a
traditional transceiver, it makes use of a tunable power amplifier
(PA)~\cite{daly_jssc07} controlled by a variable gain amplifier (VGA)
controller. Although dynamically tuning the transmitting power is an
established technique in the context of radio communications (\eg,
mobile phones, wireless sensor networks, \etc), its implementation there
requires sophisticated control policies that are hard to replicate in the
WiNoC domain. Thus, the proposed VGA controller simply uses the destination
address of the packet for accessing a look-up table containing the
configuration words used for configuring the PA. For a given
destination, the associated configuration word enables the PA to use
the minimum transmitting power to reach that destination while ensuring a
specific reliability level in terms of BER. Such optimal transmitting
power is computed offline, as discussed in the next
subsection.

%------------------------------------------------------------------------------

% \subsection{Friis Transmission Equation}
% \label{ssec:friis}
% \subsection{Signal Strength Requirements}
\subsection{Determining the Minimal Transmitting Power under a BER Constraint}
\begin{figure}
  \centering
  \includegraphics[width=0.45\textwidth]{pictures/friis.eps}
  \caption{Friis transmission equation: geometrical orientation of
    transmitting and receiving antennas. As indicated, considering a
    spherical coordinate system, $\phi$ is the azimuthal angle in the
    XY plane, where the X axis is $0^\circ$ and Y axis is
    $90^\circ$. $\theta$ is the elevation angle where the Z-axis is
    $0^\circ$, and the XY plane is $90^\circ$.}
  \label{fig:friis}
\end{figure}
The required transmitting power depends on many factors, including
the kind of modulation, the transceiver noise figure, and the
attenuation introduced by the wireless medium. Let us consider
Fig.~\ref{fig:friis}, which shows a transmitting antenna with an output
power $P_t$ and a relative angle with respect to the receiving antenna of
$(\theta_t,\phi_t)$, and a receiving antenna, located at distance $R$,
with a relative angle with respect to the transmitting antenna of
$(\theta_r,\phi_r)$. The portion of the transmitting power that
reaches the terminal of the receiving antenna, $P_r$, can be computed
by the Friis transmission
equation~\cite{balanis2008modern}, which is valid when $R>2D^2/\lambda$, where
$D$ is the maximum dimension of the antenna (axial length in our case)
and $\lambda$ is the wavelength. The Friis equation is:
\begin{equation}
  \begin{split}
  G_a &= \frac{P_r}{P_t} \\
      &= e_t e_r \frac{\lambda^2 D_t(\theta_t,\phi_t)D_r(\theta_r,\phi_r)}{(4\pi R)^2}\cdot(1-|\Gamma_t|)(1-|\Gamma_r|)|\hat{\rho_t}\cdot \hat{\rho_r}|
  \end{split}
  \label{eq:friis_complex}
\end{equation}
where:
\begin{itemize}
  \item $e_t$ and $e_r$ are the efficiencies of the transmitting and
    receiving antenna, respectively. These parameters mainly represent
    the signal losses in the silicon substrate. To reduce such
    losses, high-resistivity Silicon on Insulator (SoI)
    substrates ($>1~\mathrm{k\Omega\,cm}$) can be
    used~\cite{montusclat_ecwt05}, or a polyamide layer (a few microns
    thick) can be inserted under the antenna~\cite{lee_mobicom_09}.
  
  \item $D_t$ and $D_r$ are the directivities of the transmitting and
    receiving antenna, respectively. They quantify how much better the
    antenna can transmit or receive in a specific direction.
  
  \item $\lambda$ is the effective wavelength. For an IC substrate, it
    is estimated by using the material properties of the top IC layers
    (silicon dioxide $\epsilon_r=3.9$)~\cite{gutierez_jsac09}.

  \item $|\Gamma|$ refers to the portion of the transmitting/receiving
    power that returns to the transceiver due to impedance mismatch
    (ideally $|\Gamma|=0$).  This parameter is known as reflection
    coefficient.

  \item $|\hat{\rho_t} \cdot \hat{\rho_r}|$ takes into account the
    polarization status of the emitted EM wave (ideally, it is equal
    to one).
\end{itemize}
Eqn.~(\ref{eq:friis_complex}) highlights the
parameters which determine the gain $G_a$. However, in practical
cases, $G_a$ computation is performed by means of
Eqn.~(\ref{eq:friis_measured}).
\begin{equation}
  G_a=\frac{P_r}{P_t}=\frac{|S_{12}|}{(1-|S_{11}|)(1-|S_{22}|)}
  \label{eq:friis_measured}
\end{equation}
where $S_{11}$, $S_{12}$, and $S_{22}$ are the scattering
parameters. Such parameters are computed by using field solver
simulation tools~\cite{floyd_jssc02} or by direct measurements on
realized prototypes.
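
As a plausibility check of the far-field condition $R>2D^2/\lambda$,
consider values used elsewhere in this paper: an operating frequency
of 60~GHz, silicon dioxide with $\epsilon_r=3.9$, and the zigzag
antenna with an axial length of $2 \times 340~\mathrm{\mu m}$ described
in Section~\ref{sec:experimental}. Taking $\lambda =
c/(f\sqrt{\epsilon_r})$ as a simple estimate of the effective
wavelength (an assumption made here only for illustration), we obtain:
\begin{align*}
  \lambda &\approx \frac{3 \times 10^{8}~\mathrm{m/s}}{60 \times
    10^{9}~\mathrm{Hz} \cdot \sqrt{3.9}} \approx 2.53~\mathrm{mm}, \\
  \frac{2D^2}{\lambda} &\approx \frac{2 \cdot
    (0.68~\mathrm{mm})^2}{2.53~\mathrm{mm}} \approx 0.37~\mathrm{mm}.
\end{align*}
Under these assumptions, antennas separated by more than a few hundred
micrometers would already satisfy the condition, which is small
compared with the millimeter-scale distances available on a
$\mathrm{20 \ mm \times 20 \ mm}$ die.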

Using Eqn.~(\ref{eq:friis_measured}) it is possible to estimate the signal
attenuation due to the wireless medium.  Since the
communication reliability is related to the energy per bit, $E_b$, spent to
reach the receiver's antenna, we can determine the power required by
the transmitter for each value of attenuation $G_a$. In particular,
for the ASK-OOK modulation the bit error rate can be computed as:
\begin{equation}
  BER=Q\bigg( \sqrt{\frac{E_b}{N_0}}\bigg)
  \label{eq:ber}
\end{equation}
where $N_0$ is the transceiver noise spectral density and the $Q$
function is the tail probability of the standard normal distribution
defined by Eqn.~(\ref{eq:q_func}).
\begin{equation} 
  \label{eq:q_func}
  Q(x)=\frac{1}{\sqrt{2\pi}}\int_{x}^{\infty} e^{-\frac{y^2}{2}}dy   
\end{equation}

Since $E_b=P_r/R_b$, where $P_r$ is the power received at the terminal
of the receiver antenna and $R_b$ is the data rate, we can compute
the received power required for a given data rate, BER
requirement, and transceiver thermal noise as:
\begin{equation}
  P_r = E_b \cdot R_b = \left[Q^{-1}(BER)\right]^2 N_0 R_b
  \label{eq:pr}
\end{equation}
where $Q^{-1}$ is the inverse of the $Q$ function.

Thus, the minimum transmitting power to reach a certain receiver
guaranteeing a maximum BER can be computed as:
\begin{equation}
  P_t(dBm) = P_r(dBm) - G_a(dB)
  \label{eq:pt}
\end{equation}
where $P_r(dBm)$ is given by Eqn.~(\ref{eq:pr}) expressed in dBm, while
$G_a(dB)$ is the attenuation, expressed in dB, computed by using a field
solver with the Friis formula.\footnote{The absolute power $P$, in
  watts, can be expressed in dBm as $P_{dBm} = 10 \cdot \log_{10}{(P \cdot 10^3)}$.}
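
To make the use of Eqns.~(\ref{eq:pr}) and~(\ref{eq:pt}) concrete, the
following sketch works through one hypothetical link. All numbers are
assumptions chosen for illustration only: a target BER of $10^{-9}$
(for which $Q^{-1}(BER) \approx 6.0$), a noise spectral density equal
to the thermal floor at 290~K, $N_0 \approx -174~\mathrm{dBm/Hz}$ (a
real receiver would add its noise figure), a data rate $R_b =
8$~Gbps, and an attenuation $G_a = -30$~dB.
\begin{align*}
  P_r(dBm) &= 10\log_{10}\left[Q^{-1}(BER)\right]^2 + N_0(dBm/Hz)
              + 10\log_{10} R_b \\
           &\approx 15.6 - 174 + 99.0 = -59.4~\mathrm{dBm}, \\
  P_t(dBm) &= P_r(dBm) - G_a(dB) \approx -59.4 - (-30)
            = -29.4~\mathrm{dBm}.
\end{align*}
With the same assumptions, a closer receiver with $G_a = -20$~dB would
need only about $-39.4$~dBm, \ie, 10~dB less transmitting power than
the farther one. This difference is the margin that a
destination-dependent transmitting power can exploit.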

%------------------------------------------------------------------------------

\subsection{Overall Flow}
\label{ssec:pmap_det}
Let us now provide the basic steps needed for determining the optimal
transmitting power for each node pair. For the sake of clarity,
we consider a network in which the radio hubs are arranged in a mesh
fashion. This simplifies the characterization of the antennas,
as symmetries can be exploited.
%% Such transmitting power information are clustered with a
%% certain granularity based on the number of power steps used by the PA
%% and stored into a lookup table in the VGA controller. The basic steps
%% are summarized as follows.
\begin{enumerate}
  \item Compute the attenuation map. For each pair \textless
    transmitting antenna, receiving antenna\textgreater, extract the
    scattering parameters $S_{11}$, $S_{12}$, and $S_{22}$ and compute
    the gain by means of Eqn.~(\ref{eq:friis_measured}). If a test chip
    is available, the scattering parameters
    can be directly measured by means of a network
    analyzer~\cite{o_ted05}.
  
  \item Compute the power map. For each pair \textless transmitting
    antenna $i$, receiving antenna $j$\textgreater, based on the required
    transmission data rate and the maximum allowed BER, use
    Eqn.~(\ref{eq:pr}) and then Eqn.~(\ref{eq:pt}) for computing the
    minimum transmitting power that meets the BER constraint. Let us
    indicate this transmitting power value with $PM(i,j)$.

  %% \item Cluster the optimal transmitting power set. By means of a
  %%   clustering method (\eg, K-Means), cluster the optimal transmitting
  %%   power values computed in the previous step in a number of clusters
  %%   equal to the chosen number of power steps. Let us indicate with
  %%   $PS={ps_1, ps_2, \ldots, ps_n}$ the set of power steps represented
  %%   by the centroids of the clusters.

  \item Determine the power steps. Let $n$ be the number of desired
    power steps and $PM_{min}$, $PM_{max}$ the minimum and maximum
    value of $PM$, respectively. The set of power steps $PS=\{ps_1,
    ps_2, \ldots, ps_n\}$ is defined by dividing the interval
    $[PM_{min}, PM_{max}]$ into $n$ equally spaced levels, for which the
    $i$-th power step is:
    \[ ps_i = PM_{min} + (i-1) \times \frac{PM_{max} - PM_{min}}{n-1}. \]

  \item Configure the VGA controller. Upload the look-up table in each
    VGA controller as follows. Let $LUT_i$ be the look-up table in the
    VGA controller of radio hub $i$. $LUT_i(j)$ encodes the power step
    to be used to transmit to radio hub $j$. Such power step is
    selected as the minimum $ps \in PS$ such that $PM(i,j) \le ps$.
\end{enumerate}
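As a small numerical illustration of the last two steps, suppose
(hypothetically) that the power map spans $PM_{min}=-10$~dBm to
$PM_{max}=2$~dBm and that $n=4$ power steps are available. Using $k$
as the step index, the power steps would be:
\[ ps_k = -10 + (k-1) \times \frac{2-(-10)}{4-1} = -10 + 4(k-1), \qquad
   PS = \{-10, -6, -2, 2\}~\mathrm{dBm}. \]
A pair with $PM(i,j)=-4.5$~dBm would then be assigned
$LUT_i(j)=ps_3=-2$~dBm, the smallest step not below the required
power. The BER constraint is still met, at the cost of 2.5~dB of
excess transmitting power due to the quantization into $n$ levels.
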
In the following section we will assess the effectiveness of the
proposed technique in reducing the communication energy consumption.

%------------------------------------------------------------------------------

%% \subsection{The Mapping Problem}
%% \label{sec:mapping}
%% Several works in literature have shown the effectiveness of mapping
%% techniques for improving different metrics including performance and
%% energy consumtion~\cite{}. The role played by the mapping becomes even
%% more important in the context of WiNoCs due to the possibility of
%% exploiting a additional degrees of freedom, including, the association
%% between the radio hub and the cluster of concentrated cores, the
%% directionality of the antenna, the number of radio channels to be
%% used, etc. In this paper, we explore one of the aforementioned new
%% mapping dimensions, namely, the association between the radio hub and
%% the cluster of concentrated cores. Specifically, the mapping problem,
%% shown in Fig.~\ref{fig:mapping_description}, is formulated as follows.
%% \begin{figure}
%%   \centering
%%   \includegraphics[width=0.45\textwidth]{pictures/mapping_description.eps}
%%   \caption{The mapping process.}
%%   \label{fig:mapping_description}
%% \end{figure}

%% Let $NG=G(R, RR, L_{R}, L_{RR})$ be the \emph{network graph} where $R$
%% is the set of routers, $RR$ is the set of radio routers, $L_{R}$ is
%% the set of links connecting the routers in $R$, and $L_{RR}$ is the
%% set of links connecting the radio routers in $RR$ with the routers in
%% $R$. We assume that all the links $l_{R} \in L_{R}$ and $l_{RR} \in
%% L_{RR}$ have the same bandwidth capacity $cap$ and the same energy
%% consumption per bit $e_l$. Let $PE$ the set of
%% processing elements. We assume direct networks for each there is a
%% router for each processing element.

%% Let $e_r(rr_s,rr_d)$, with $rr_s,rr_d \in RR$, be the \emph{radio
%%   transmission energy function} which provides the minimum
%% transmission energy per bit for a radio communication from radio
%% router $rr_s$ to radio router $rr_d$.

%% Let $AG=G(T,C)$ be the \emph{application graph} where $T$ is the set
%% of tasks and $C$ is the set of communications among tasks. Let $bnd(c)$
%% and $vol(c)$ be the \emph{communication bandwidth} (in bit/sec) and the
%% \emph{communication volume} (in bit) of communication $c \in C$,
%% respectively.


%% Based on the above definitions, the mapping problem can be formulated
%% as follows. Find a \emph{mapping function}, $map:T \rightarrow PE$,
%% such that the communication energy is minimised and the bandwidth
%% constraints are met. The communication energy, $E$, is the the product
%% between the communication volume and the total energy per bit spent on
%% links and radio transmissions over all the communications. It is
%% computed as follows:
%% \begin{equation}
%%   \begin{aligned}
%%   E =& \sum_{\substack{c=(t_s,t_d) \in C \\ pe_s=map(t_s)
%%         \\ pe_d=map(t_d)}} vol(c) \big[ |LT(pe_s,pe_d)| e_l + \\
%%        & + \sum_{(rr_s,rr_d) \in RRP(pe_s,pe_d)} e_r(rr_s,rr_d) \big],
%%   \end{aligned}
%%   \label{eqn:mapping_energy}
%% \end{equation}
%% where $LT(pe_s,pe_d)$ returns the set of links traversed for the
%% communication between $pe_s$ and $pe_d$, and $RRP(pe_s,pe_d)$ returns
%% the set of radio router pairs (transmitter, receiver) involved in the
%% communication between $pe_s$ and $pe_d$.

%% The bandwidth constraints refer to the fact that the aggregated
%% bandwidth on links cannot exceed their capacity. That is:
%% \[ \sum_{\substack{c=(t_s,t_d) \in C \\ pe_s=map(t_s)
%%     \\ pe_d=map(t_d)}} bnd(c) \times PT(pe_s,pe_d,l) \leq cap \quad
%% \forall l \in L_R, \] where $PT(pe_s,pe_d,l)$ is the pass-through
%% function which returns 1 if $l$ belongs to the routing path for the
%% communication between $pe_s$ and $pe_d$ and 0 otherwise. That is,
%% $PT(pe_s,pe_d,l)=1$ if $l \in LT(pe_s,pe_d)$.

%% Differently from the traditional mapping techniques proposed in
%% literature~\cite{}, here, the mapping selection depends also by the
%% location of the radio routers which is accounted by the radio
%% transmission energy function $e_r$ in
%% Eqn.~(\ref{eqn:mapping_energy}). Such additional degree of freedom
%% results in new opportunities for energy optimization as it will be
%% shown in the experiments section.

%------------------------------------------------------------------------------

\section{Experiments}
\label{sec:experimental}
In this section we present the results of experiments considering a
mesh-based WiNoC on a $\mathrm{20 \ mm \times 20 \ mm}$ silicon die.  A
zigzag antenna has been accurately modeled and characterized with
Ansoft HFSS~\cite{hfss} (High Frequency Structure Simulator). HFSS is
a leading commercial finite element method (FEM) field solver which
simulates 3D structures and produces S-parameters and radiation
patterns. We considered a high-resistivity ($\rho=5~\mathrm{k\Omega\,cm}$)
SOI with a substrate thickness of $350~\mathrm{\mu m}$ and
$30~\mathrm{\mu m}$ for the oxide ($\mathrm{SiO_2}$). The antennas are situated
at an elevation of $2~\mathrm{\mu m}$ from the substrate, in accordance
with the guidelines reported in~\cite{seok_iitc05} for reducing the
interference with other metal structures (\cite{seok_iitc05}
demonstrates that the interference due to other metallic structures is
negligible when such rules are followed).  The zigzag antenna has a
thickness of $2~\mathrm{\mu m}$ and an axial length of $2 \times
340~\mathrm{\mu m}$ for operating at around 60~GHz. The same setup has
been used in~\cite{montusclat_ecwt05}.

From HFSS simulation we obtain the scattering parameters ($S_{11}$ and
$S_{12}$) used for computing the Friis formula and then for
calculating the attenuation introduced by the wireless medium. In
particular, $S_{11}$ is also used for determining the antenna
bandwidth as discussed in the following subsection.

%------------------------------------------------------------------------------

\subsection{Bandwidth and Radiation Pattern}
\begin{figure}
  \centering
  \includegraphics[width=0.45\textwidth]{pictures/s11.eps}
  \caption{$S_{11}$ parameter of the zigzag antenna. The bandwidth is the range
  of frequencies over which $S_{11}$ is below $-10$~dB.}
  \label{fig:s11}
\end{figure}
Fig.~\ref{fig:s11} shows the $S_{11}$ parameter, which quantifies the
portion of the transmitted power that is reflected back to the power
amplifier (PA) due to impedance mismatch ($50~\mathrm{\Omega}$). As a
rule of thumb~\cite{balanis2008modern}, the antenna
impedance can be considered matched to the transceiver when, at the
operating frequency, $S_{11}$ is below $-10$~dB.  We use $S_{11}$ for
defining the antenna bandwidth because, outside the range of frequencies
for which $S_{11} < -10~\mathrm{dB}$, the antenna not only does not
work properly as a transducer but could also affect the physical
integrity of the final stage of the PA.

Fig.~\ref{fig:s11} shows a bandwidth of about 16~GHz, which is
enough for providing a data rate upper bound of 8~Gbps with ASK-OOK
modulation. Denoting this bandwidth by $B_W$, the relative bandwidth
of the antenna is:
\[  B_{r}=\frac{B_W}{f_c}=\frac{16~\mathrm{GHz}}{59~\mathrm{GHz}} \approx 0.27 \]
where $f_c$ is the resonance frequency. This information is useful for
determining the resonance frequency for which the antenna should be
designed in order to obtain data rates higher than 8~Gbps, or to have
more bandwidth available for frequency division multiplexing (FDM). For
instance, for 4~channels with a data rate of 8~Gbps each, the antenna
should be designed with a resonance frequency of at least:
\[  f_c=\frac{4 \, B_W}{B_r}=\frac{4 \times 16~\mathrm{GHz}}{0.27} \approx 237~\mathrm{GHz} \]
which is obtainable by properly scaling the dimensions of the antenna
(mainly the axial length).
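
As a cross-check of the scaling argument above, and assuming that the
relative bandwidth $B_r$ is preserved when the antenna dimensions are
scaled (an assumption made here only for illustration), the resonance
frequency needed for $N$ channels of bandwidth $B_W$ each, where $N$ is
a symbol introduced for this example, is
\[ f_c(N) \ge \frac{N \, B_W}{B_r}
   = N \times 16~\mathrm{GHz} \times \frac{59~\mathrm{GHz}}{16~\mathrm{GHz}}
   = N \times 59~\mathrm{GHz}, \]
so that $f_c(4) \ge 236~\mathrm{GHz}$. The slightly larger value of
237~GHz is obtained when $B_r$ is rounded to 0.27 before dividing.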

\begin{figure}
  \centering
  \includegraphics[width=0.35\textwidth, angle=270]{pictures/radiation.eps}
  \caption{Radiation pattern for a zigzag antenna at the horizon
    ($\phi=90^\circ$, continuous line) and at the elevation of maximum
    radiation ($\phi=35^\circ$, dashed line). $\theta=0^\circ$ is the
    direction parallel to the antenna's main axis while $\theta=90^\circ$ is
    the orthogonal direction. As in Fig.~\ref{fig:friis}, we
    assume that the antenna lies on the XY plane.}
  \label{fig:radiation}
\end{figure}
Another important result of the simulation is the normalized radiation
pattern shown in Fig.~\ref{fig:radiation}. The radiation pattern is a
polar representation of the directivity, represented by the term $D$ in
Eqn.~(\ref{eq:friis_complex}). As can be observed, the best
performance is obtained when the antenna transmits or receives along
the direction of its main axis. With this information,
Eqn.~(\ref{eq:friis_complex}) can be used to estimate the attenuation
in a particular direction, as shown in the next subsection.

%------------------------------------------------------------------------------

\subsection{Attenuation Maps}
\begin{figure*}
  \centering
  \begin{tabular}{cccc}
    \includegraphics[width=0.22\textwidth]{pictures/pmap_c0.eps} &
    \includegraphics[width=0.22\textwidth]{pictures/pmap_c1.eps} &
    \includegraphics[width=0.22\textwidth]{pictures/pmap_c4.eps} &
    \includegraphics[width=0.22\textwidth]{pictures/pmap_c5.eps}
  \end{tabular}
  \caption{HFSS simulation results: attenuation map ($G_a$) for the
    tiles $t_0$, $t_1$, $t_4$ and $t_5$.  The other maps can be obtained
    by exploiting the symmetries of the structure.}
  \label{fig:pmap}
\end{figure*}
Let us consider a mesh based WiNoC formed by a set of $T$ tiles and a
radio hub for each tile. Let us now analyze the attenuation of the
signal transmitted by an antenna in a tile $t \in T$ as viewed by the
remaining antennas located at tiles $T \setminus \{t\}$. In the
experiments we considered $|T|=16$ in which the distance between two
antennas in the same axis is 2.5~mm.

Fig.~\ref{fig:pmap} shows the attenuation $G_a$ for a transmitting
antenna located on tiles $t_0$, $t_1$, $t_4$, and $t_5$. The
antenna exhibits very different behavior when it is placed in
different locations within the die~\cite{gutierez_jsac09}, so the
measurements should in principle be performed for all the possible
positions of the transmitting and receiving antennas. Fortunately, due
to the symmetrical structure of mesh-based topologies, only four
measurements are needed in our case: the other attenuation maps (\ie,
the attenuations when the transmitting antenna is located in other
tiles) can be found by symmetry. For instance, the attenuation observed
by a receiving antenna on tile $t_{13}$ when the transmitting antenna is
on tile $t_{12}$, $G_a(t_{12},t_{13})$, is the same as that observed by
the receiving antenna on tile $t_1$ when the transmitting antenna
is on tile $t_0$, $G_a(t_{0},t_{1})$. Similarly, we have
$G_a(t_{15},t_{14})=G_a(t_{0},t_{1})$,
$G_a(t_{3},t_{2})=G_a(t_{0},t_{1})$, and so on. In addition,
$G_a(t_x,t_y)=G_a(t_y,t_x)$ for each $t_x, t_y \in T$.
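
The symmetry argument can be made explicit under the assumption, not
stated above but consistent with the examples given, that the 16 tiles
are numbered row by row, so that tile $t_{4r+c}$ lies at row $r$ and
column $c$ with $r,c \in \{0,1,2,3\}$. Writing $\sigma_h$ and
$\sigma_v$ (names introduced here for illustration) for the two mirror
symmetries of the mesh,
\[ \sigma_h(t_{4r+c}) = t_{4r+(3-c)}, \qquad \sigma_v(t_{4r+c}) = t_{4(3-r)+c}, \]
the attenuation satisfies $G_a(\sigma(t_x),\sigma(t_y)) = G_a(t_x,t_y)$
for $\sigma \in \{\sigma_h, \sigma_v, \sigma_h \circ \sigma_v\}$. For
instance,
\[ \sigma_v: (t_0,t_1) \mapsto (t_{12},t_{13}), \qquad
   \sigma_h: (t_0,t_1) \mapsto (t_{3},t_{2}), \qquad
   \sigma_h \circ \sigma_v: (t_0,t_1) \mapsto (t_{15},t_{14}). \]
Under this numbering, every tile is the image of one of $t_0$, $t_1$,
$t_4$, and $t_5$ under these maps, which is consistent with four
attenuation maps being sufficient.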

As can be observed from Fig.~\ref{fig:pmap}, the attenuation
introduced by the wireless medium depends not only on the relative
distance between the radio hubs but also on their relative
orientation. For instance, $G_a(t_0,t_3)<G_a(t_0,t_4)$ although the
distance between $t_0$ and $t_3$ is three times larger than the
distance between $t_0$ and $t_4$. This can be explained by observing the
radiation pattern in Fig.~\ref{fig:radiation}, in which the performance
of the antenna improves as the direction of transmission or reception
approaches its main axis.

The attenuation map is used for computing the maximum
and minimum transmitting power that guarantee a certain reliability
level. For the sake of example, let us consider a maximum BER of $3
\times 10^{-14}$ and a data rate of 8~Gbps. From Eqn.~(\ref{eq:pr}),
the power received by the receiving antenna must be
$-54$~dBm. From the attenuation maps shown in Fig.~\ref{fig:pmap}, the
maximum attenuation is $-53$~dB. Thus, the transmitting power (which is
maximum, as this is the worst case) is computed by Eqn.~(\ref{eq:pt})
as $P_{t,max} = -54 - (-53) = -1~\mathrm{dBm}$, which in linear scale
is $P_{t,max}=794~\mathrm{\mu W}$. Similarly, we can compute the
minimum transmitting power. The minimum attenuation is $-33$~dB, thus
$P_{t,min} = -54 - (-33) = -21~\mathrm{dBm}$, which in linear scale is
$P_{t,min}=8~\mathrm{\mu W}$.
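
The conversions above can be reproduced as follows, assuming that
Eqn.~(\ref{eq:pt}) reduces to a difference in the logarithmic domain, as
the arithmetic in the text suggests:
\[ P_t~[\mathrm{dBm}] = P_r~[\mathrm{dBm}] - G_a~[\mathrm{dB}], \qquad
   P~[\mathrm{mW}] = 10^{P[\mathrm{dBm}]/10}. \]
Hence
\[ P_{t,max} = 10^{-1/10}~\mathrm{mW} \approx 0.794~\mathrm{mW}, \qquad
   P_{t,min} = 10^{-21/10}~\mathrm{mW} \approx 7.94 \times 10^{-3}~\mathrm{mW}, \]
and the 20~dB gap between the two values corresponds to a factor of 100
in linear scale.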

%------------------------------------------------------------------------------

\subsection{VGA Controller Analysis}
Let us consider the transceiver proposed in~\cite{daly_jssc07}, also
used in~\cite{ditommaso_hoti11}, which supports adjustable output
power steps. For this transceiver we estimate a power
consumption of 7~mW and 23~mW for the minimum and maximum transmitting
power, respectively. These values correspond to an energy per bit
ranging from 0.42~pJ/bit to 1.4~pJ/bit.

\begin{figure}
  \centering
  \includegraphics[width=0.55\textwidth]{pictures/power.eps}
  \caption{Average power dissipated by the VGA controller for
    different power steps and different packet sizes.}
  \label{fig:vga_power}
\end{figure}
The logic of the VGA controller has been synthesized
and evaluated using Synopsys Design Compiler for different
numbers of admissible power steps (3, 7 and 15 power steps).
The power analysis has been performed on the gate-level implementation
of the controller, using test benches with different packet sizes.
In fact, as the packet size increases, the toggle
rate of the VGA controller decreases, since it is active only for the
header flit of the packet. Fig.~\ref{fig:vga_power} shows the average
power dissipation of the VGA controller for different packet sizes,
considering a 28~nm CMOS standard cell library from TSMC operating at
1~GHz. As can be observed, for a 10-flit packet, the average power
dissipation of the VGA controller is as low as $21~\mathrm{\mu W}$ for
the 3-step implementation, and about $50~\mathrm{\mu W}$ for the
15-step implementation.

\begin{figure}
  \centering
  \begin{tabular}{cc}
    \includegraphics[width=0.3\textwidth]{pictures/area.eps} & \includegraphics[width=0.3\textwidth]{pictures/timing.eps} \\
    (a) & (b)
    \end{tabular}
  \caption{VGA Controller synthesis results: area overhead (a) and timing results (b).}
  \label{fig:vga_timing_area}
\end{figure}
With regard to the overhead in terms of silicon area,
Fig.~\ref{fig:vga_timing_area}(a) shows such contribution for the
considered implementations. As can be observed, the area occupation
ranges between $50~\mathrm{\mu m^2}$ and $90~\mathrm{\mu m^2}$ for
the implementations with 3 and 15 power steps, respectively.

\begin{figure*}
  \centering
  \includegraphics[width=0.80\textwidth]{pictures/pipeline.eps}
  \caption{Pipeline of a conventional radio hub.}
  \label{fig:pipeline}
\end{figure*}
Timing results are shown in Fig.~\ref{fig:vga_timing_area}(b) in terms
of FO4. To understand how the introduction of the VGA controller
impacts the delay metrics of the radio hub, let us consider the
pipeline of a typical radio hub architecture~\cite{deb_jetcas12}, as
depicted in Fig.~\ref{fig:pipeline}, augmented with the proposed
VGA controller. As can be observed, the VGA controller works in
parallel with the serializer. Thus, in terms of latency, the use of
the proposed technique does not affect the pipeline depth of the radio
hub. In terms of clock frequency, the delay introduced by the VGA
controller does not affect the critical path, as the 8~FO4 delay exhibited
by the 15-step implementation (the slowest among the three
considered in this paper) is less than that of other more complex
modules in the datapath (\eg, routing logic and crossbar).

\begin{figure}
  \centering
  \begin{tabular}{cc}
    \includegraphics[width=0.35\textwidth]{pictures/area_breackdown.eps} & \includegraphics[width=0.35\textwidth]{pictures/power_breackdown.eps} \\
    (a) & (b)
    \end{tabular}
  \caption{Area (a) and power (b) breakdown of the radio hub.}
  \label{fig:breakdown}
\end{figure}
Finally, Fig.~\ref{fig:breakdown} shows the area and power breakdown
of the radio hub. As can be observed, the VGA controller accounts
for a negligible fraction, less than 0.05\%, of the overall area and
power budget.


%------------------------------------------------------------------------------

\subsection{Total Energy Saving}
The effectiveness of the proposed technique is affected by the number
of power steps provided by the VGA controller. To quantify such
impact, let us apply the proposed technique to two different WiNoC
architectures proposed in the literature, namely,
iWise~\cite{ditommaso_hoti11} and
McWiNoC~\cite{zhao_nocs11}. Specifically, we compare the
following NoC architectures:
\begin{enumerate}
  \item Wire-line: A traditional $8 \times 8$ concentrated mesh, with
    clusters formed by 4~cores.

  \item McWiNoC: The architecture described in~\cite{zhao_nocs11} for
    an $8 \times 8$ mesh with 4~cores associated with each radio
    hub. This kind of architecture uses TDM for the
    wireless medium. The entire bandwidth can be allocated
    to each communication due to the particular structure of the
    architecture.

  \item Proposed McWiNoC: Like McWiNoC but augmented with the proposed
    VGA controller.

  \item iWise64: The architecture described in~\cite{ditommaso_hoti11}
    in which the entire bandwidth is divided in four different channels.

  \item Proposed iWise64: Like iWise64 but augmented with the proposed
    VGA controller.
\end{enumerate}
The power data presented in the previous subsection have been used for
back-annotating a cycle-accurate NoC simulator based on
Noxim~\cite{noxim}, which has been extended for simulating WiNoC
architectures.

\begin{figure*}
  \centering
  \begin{tabular}{cc}
    \includegraphics[width=0.45\textwidth]{pictures/power_steps_iwise.eps} &
    \includegraphics[width=0.45\textwidth]{pictures/power_steps_mcwinoc.eps} \\
    (a) & (b)
    \end{tabular}
  \caption{Energy saving over a traditional wire-line NoC when the
    proposed VGA controller is applied to an iWise64 architecture (a)
    and to a McWiNoC architecture (b).}
  \label{fig:power_steps}
\end{figure*}
Assuming the wire-line NoC as baseline, Fig.~\ref{fig:power_steps}
shows the overall communication energy saving for different SPLASH-2
benchmarks when the proposed VGA controller is applied to iWise and
McWiNoC. In particular, we considered four versions of the VGA
controller, namely, 3-, 7-, 15-, and INF-step, which refer to the
number of power steps. Note that the INF-step
version is a theoretical case (\ie, it represents an upper bound in
terms of energy saving) in which the transmission energy is tuned in a
continuous, rather than discrete, fashion. As can be observed, on
average, iWise and McWiNoC are 22\% and 12\% more energy efficient
than the traditional wire-line NoC, respectively. By using the
proposed approach, the energy saving increases, on average, by 50\%
and 46\% for iWise and McWiNoC, respectively.  As expected, the number
of power steps impacts the energy saving, but no relevant improvements
are observed with more than 7 power steps. For this reason, in the rest
of the experiments, we assume a VGA controller with 7 power steps unless
otherwise specified.
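
One possible reading of this saturation is sketched below. It rests on
two assumptions made here only for illustration, since the spacing of
the power steps is not specified in this section: that a $K$-step
controller provides $K$ power levels uniformly spaced in dB between
$P_{t,min}=-21~\mathrm{dBm}$ and $P_{t,max}=-1~\mathrm{dBm}$, and that
the lowest level satisfying the required transmitting power is
selected. The spacing $\Delta_K$ between adjacent levels (a symbol
introduced for this example) is then
\[ \Delta_K = \frac{P_{t,max}-P_{t,min}}{K-1} = \frac{20~\mathrm{dB}}{K-1},
   \qquad \Delta_3 = 10~\mathrm{dB}, \quad
   \Delta_7 \approx 3.3~\mathrm{dB}, \quad
   \Delta_{15} \approx 1.4~\mathrm{dB}, \]
and the excess transmitting power over the continuous (INF-step) case is
bounded by $\Delta_K$, \ie, by a linear factor of
$10^{\Delta_K/10} \approx 10$, $2.15$, and $1.39$ for $K=3$, $7$, and
$15$, respectively. Under these assumptions, moving from 3 to 7 steps
tightens the bound by about 6.7~dB, whereas moving from 7 to 15 steps
tightens it by only about 1.9~dB.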

\begin{figure}
  \centering
  \includegraphics[width=0.45\textwidth]{pictures/mswinoc_saving.eps} \\
   \caption{Energy saving for a 256-core mSWNoC.}
  \label{fig:results_mswinoc}
\end{figure}
In order to explore the impact of the proposed scheme on mm-wave
small-world topology based WiNoCs (mSWNoC), we apply it
to the mSWNoC architecture presented in~\cite{deb_tc13}. Such
mSWNoC is a two-level hierarchical network where the top level is a
mesh topology whereas the lower-level sub-networks are star-ring
networks. Since the upper network is a mesh, the attenuation maps
shown in Fig.~\ref{fig:pmap} can be
reused. Fig.~\ref{fig:results_mswinoc} shows the energy saving for
different numbers of radio hubs under a uniform traffic scenario. As
expected, as the number of radio hubs increases, the energy saving
increases.


%------------------------------------------------------------------------------

\subsection{Application Mapping}
The way in which tasks are mapped onto the NoC has a significant impact
on performance and power metrics~\cite{sahu_jsa13}. In fact, the
possibility of tuning the transmitting power based on the location of
the destination node can be seen as a new degree of freedom in the
mapping problem, which results in new opportunities for energy
optimization. In this subsection we assess the improvement in energy
saving when the GAMAP mapping technique~\cite{palesi_jucs12} is used
in conjunction with the proposed technique. We consider some of the
benchmarks in the SPLASH-2 benchmark suite. The benchmarks have been
executed on the Graphite multicore simulator~\cite{miller_hpca10} and
their communication traces have been extracted. The communication
traces are then converted to communication graphs, which form the input
for the considered mapping technique.

%% Based on the formulation of the mapping problem stated in
%% Sec.~\ref{sec:mapping}, let us now present some experimental
%% results. We used the applications in the SPLASH-2 and PARSEC
%% benchmarks suites. The benchmarks have been executed on Graphite
%% Multicore Simulator~\cite{miller_hpca10} and the application graphs
%% have been extracted. Then, simulated annealing has been used to map
%% the application graph into the nodes of the network with the objective
%% of minimizing the total communication energy consumption as defined in
%% Eqn.~(\ref{eqn:mapping_energy}).
%% Based on the formulation of the mapping problem stated in
%% Sec.~\ref{sec:mapping}, simulated annealing has been used to map
%% the application graph into the nodes of the network with the objective
%% of minimizing the total communication energy consumption as defined in
%% Eqn.~(\ref{eqn:mapping_energy}).

\begin{figure*}
  \centering
  \begin{tabular}{cc}
    \includegraphics[width=0.40\textwidth]{pictures/power_saving_mapping_iwise.eps} &
    \includegraphics[width=0.40\textwidth]{pictures/power_saving_mapping_mcwinoc.eps} \\
    (a) & (b)
    \end{tabular}
  \caption{Impact of the mapping on energy consumption. Energy saving
    over a traditional wire-line NoC when the proposed VGA controller
    is applied to an iWise64 architecture (a) and to a McWiNoC
    architecture (b).}
  \label{fig:mapping_rnd_vs_sa}
\end{figure*}
Fig.~\ref{fig:mapping_rnd_vs_sa} shows the percentage communication
energy saving (considering the wire-line NoC as baseline) when the
mapping is optimized. In particular, for both iWise and McWiNoC we
analyzed the following four configurations: 1) the proposed technique is
not applied and a random mapping is used, 2) the proposed technique is
applied and a random mapping is used, 3) the proposed technique is not
applied and the application mapping is optimized, and 4) the proposed
technique is applied and the application mapping is optimized. The
energy consumption in the random mapping case is
measured by averaging the energy consumption over 1,000 random
mappings. As can be observed, on average, the optimization of the
mapping in conjunction with the proposed technique improves the energy
efficiency by 72\% and 62\% for iWise and McWiNoC, respectively.

%------------------------------------------------------------------------------

\subsection{Case Study}
\begin{figure}
  \centering
  \includegraphics[width=0.45\textwidth]{pictures/case_study64.eps}
  \caption{Heterogeneous system composed of a multimedia sub-system, a
    MIMO-OFDM receiver, a PiP and an MWD module.}
  \label{fig:case_study64}
\end{figure}
Finally, as a case study, we consider the complex heterogeneous system
shown in Fig.~\ref{fig:case_study64}. The system is composed of a
generic multimedia system which includes an H.263 video encoder, an
H.263 video decoder, an MP3 audio encoder and an MP3 audio
decoder~\cite{hu_tcad05}, a MIMO-OFDM receiver~\cite{yoon_act06}, a
Picture-In-Picture application (PiP)~\cite{jaspers_tice99} and a
Multi-Window Display application (MWD)~\cite{vandertol_mp02}. We
have mapped the application on both iWise and McWiNoC and assessed the
energy saving when the proposed technique is used.

\begin{figure}
  \centering
  \includegraphics[width=0.40\textwidth]{pictures/results_cg.eps} \\
   \caption{Normalized energy consumption for iWise64 and McWiNoC when
     the proposed technique is applied.}
  \label{fig:results}
\end{figure}


Fig.~\ref{fig:results} shows the normalized energy consumption of the
different architectures as compared to the wire-line NoC. As can be
observed, the application of the proposed technique results in
energy savings of up to 50\% and 48\% when applied to iWise64
and McWiNoC, respectively.

%------------------------------------------------------------------------------

\section{Conclusions}
\label{sec:conclusions}
Emerging communication technologies like wireless NoC (WiNoC) are
considered a viable solution for facing the scalability and
energy consumption issues in many-core system
architectures. Unfortunately, the transceiver of the radio hub in a
WiNoC accounts for a significant fraction of the overall communication
energy budget. In this paper we have presented a reliability-aware,
runtime-tunable transmitting power technique for improving the energy
efficiency of the transceiver in wireless NoC architectures. We have
applied the proposed technique to two known WiNoC architectures,
namely, iWise64~\cite{ditommaso_hoti11} and McWiNoC~\cite{zhao_nocs11},
observing an energy reduction of up to 43\% and 60\%, respectively.
An energy saving of around 20\% has also been observed when the proposed
technique has been applied to a specific mm-wave WiNoC, namely,
mSWNoC~\cite{deb_tc13}.
The hardware overhead, in terms of silicon area, introduced by the
proposed technique is negligible as compared to the area of the
transceiver (approximately four orders of magnitude smaller than the
transceiver).

We believe that the introduction of the proposed technique opens
interesting scenarios in several directions. For instance, application
mapping strategies might take into account the specific radiation
patterns of the antenna or design space exploration techniques might
consider the orientation of the antennas as an additional degree of
freedom for application specific optimization purposes.

%------------------------------------------------------------------------------
\balance

\bibliographystyle{ACM-Reference-Format-Journals} 
\bibliography{bibliography}

%------------------------------------------------------------------------------
\end{document}
